Combining bibliometrics, information retrieval, and relevance theory, Part 1: First examples of a synthesis
Author
Abstract
In Sperber and Wilson's relevance theory (RT), the ratio Cognitive Effects/Processing Effort defines the relevance of a communication. The tf*idf formula from information retrieval is used to operationalize this ratio for any item co-occurring with a user-supplied seed term in bibliometric distributions. The tf weight of the item predicts its effect on the user in the context of the seed term, and its idf weight predicts the user's processing effort in relating the item to the seed term. The idf measure, also known as statistical specificity, is shown to have unsuspected applications in quantifying interrelated concepts such as topical and nontopical relevance, levels of user expertise, and levels of authority. A new kind of visualization, the pennant diagram, illustrates these claims. The bibliometric distributions visualized are the works cocited with a seed work (Moby Dick), the authors cocited with a seed author (White HD, for maximum interpretability), and the books and articles cocited with a seed article (S.A. Harter's "Psychological Relevance and Information Science," which introduced RT to information scientists in 1992). Pennant diagrams use bibliometric data and information retrieval techniques on the system side to mimic a relevance-theoretic model of cognition on the user side. Relevance theory may thus influence the design of new visual information retrieval interfaces. Generally, when information retrieval and bibliometrics are interpreted in light of RT, the implications are rich: A single sociocognitive theory may serve to integrate research on literature-based systems with research on their users, areas now largely separate.

Introduction

In this article, I integrate ideas from bibliometrics, information retrieval, and Sperber and Wilson's influential book, Relevance: Communication and Cognition.
The synthesis is quite general, and its validity may be tested by anyone with access to standard bibliometric counts, such as those available in many databases on Dialog. When rank-ordered, these counts form highly skewed distributions called, among other things, empirical hyperbolic, core-and-scatter, scale-free, power-law, and reverse J. Whatever the name, I show that when the terms in any bibliometric distribution are treated as components of the well-known tf*idf formula from information retrieval, those terms are interpretable as what Sperber and Wilson (S&W) have called assumptions relevant in a context. The context is the seed term from which the bibliometric distribution was created, and the following definitions hold (Sperber & Wilson, 1986, 1995, p. 125): "An assumption is relevant in a context to the extent that its contextual …
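The mapping described above, with tf standing in for cognitive effects and idf for processing effort, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the article's stated procedure: the function name `pennant_coordinates`, the item names, the counts, and the choice of a base-10 logarithmic weighting (a common information retrieval variant of tf*idf) are all hypothetical.

```python
import math


def pennant_coordinates(cocounts, totals, collection_size):
    """Map each item cocited with a seed to (tf_weight, idf_weight).

    cocounts: item -> number of co-occurrences with the seed (raw tf)
    totals: item -> number of records containing the item overall (df)
    collection_size: total number of records in the collection (N)

    Assumed weighting (a common IR variant, not confirmed by the source):
    tf_weight = 1 + log10(tf), idf_weight = log10(N / df).
    """
    coords = {}
    for item, co in cocounts.items():
        if co <= 0:
            continue  # items never cocited with the seed are not plotted
        tf_weight = 1.0 + math.log10(co)
        idf_weight = math.log10(collection_size / totals[item])
        coords[item] = (tf_weight, idf_weight)
    return coords


# Hypothetical counts for three items cocited with a seed term.
cocounts = {"item_a": 100, "item_b": 10, "item_c": 10}
totals = {"item_a": 1000, "item_b": 10000, "item_c": 100}
collection_size = 1_000_000

coords = pennant_coordinates(cocounts, totals, collection_size)

# Rank items by the tf*idf product of their two weights.
ranked = sorted(coords, key=lambda k: coords[k][0] * coords[k][1], reverse=True)
for item in ranked:
    tf_w, idf_w = coords[item]
    print(f"{item}: tf={tf_w:.2f} idf={idf_w:.2f} tf*idf={tf_w * idf_w:.2f}")
```

In this sketch, an item with a high tf weight is one frequently cocited with the seed, and an item with a high idf weight is statistically more specific, that is, rarer in the collection overall. Plotting tf weight against idf weight would give the two axes of a diagram in the spirit of the pennant diagrams described in the abstract.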
Similar resources
Combining bibliometrics, information retrieval, and relevance theory, Part 2: Some implications for information science
(tf) and inverse document frequency (idf) values, plotted as pennant diagrams, and interpreted according to Sperber and Wilson's relevance theory (RT), the results evoke major variables of information science (IS). These include topicality, in the sense of intercohesion and intercoherence among texts; cognitive effects of texts in response to people's questions; people's levels of expertise a...
Semiautomatic Image Retrieval Using the High Level Semantic Labels
Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges of each approach have led researchers to combine them and to use semi-automatic retrieval, with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...
Matching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar
Background and Aim: The main aim of information storage and retrieval systems is to store and retrieve relevant information, that is, to provide documents that match users' needs or requests. This study asks how closely system relevance and user-oriented relevance match in the SID, ISC, and Google Scholar databases. Method: In this study 15 keywords of ...
Document Image Retrieval Based on Keyword Spotting Using Relevance Feedback
Keyword spotting is a well-known method in document image retrieval. In this method, search in document images is based on a query word image. In this paper, an approach to document image retrieval based on keyword spotting is proposed, and a framework using relevance feedback is presented. Relevance feedback, an interactive and efficient method, is used in this paper to imp...
The socio-cognitive theory in information retrieval (IR)
Abstract Background and Aim: Socio-cognitive theory was introduced in information science by Hjørland and Albrechtsen. The socio-cognitive view turns the traditional cognitive program upside down. The theory emphasizes the different cultural and social structures of users. Hence, the aim of the article is to explain the role of socio-cognitive theory in information retrieval (I...
Journal title:
- JASIST
Volume 58, Issue
Pages -
Publication date 2007